CLAIMS



(57) [Claim(s)] 

[Claim 1] A standard-pattern creation method for a standard pattern defined by a set of states, the
transition probabilities between states, and the vector output probabilities of the states or
transitions, characterized by creating a standard pattern in which the vector output probability
of each state or transition is a mixed continuous distribution obtained by mixing, with weights,
the vector output probability continuous distributions of the corresponding states or transitions
of a plurality of standard patterns whose vector output probabilities are expressed as continuous
distributions.

[Claim 2] A standard-pattern creation method for a standard pattern defined by a set of states, the
transition probabilities between states, and the vector output probabilities of the states or
transitions, characterized by creating a standard pattern in which the vector output probability
of each state or transition is a mixed continuous distribution obtained by mixing, with weights,
the vector output probability continuous distributions of the corresponding states or transitions
of standard patterns that are trained for each of two or more speakers using that speaker's voice
data and whose vector output probabilities are expressed as continuous distributions.

[Claim 3] A standard-pattern creation method for a standard pattern defined by a set of states, the
transition probabilities between states, and the vector output probabilities of the states or
transitions, characterized by creating a standard pattern in which the vector output probability
of each state or transition is a mixed continuous distribution obtained by mixing, with weights,
the vector output probability continuous distributions of the corresponding states or transitions
of standard patterns that are trained using voice data uttered or recorded in different
environments and whose vector output probabilities are expressed as continuous distributions.
DETAILED DESCRIPTION



[Detailed Description of the Invention] 
[Industrial Application] 

This invention relates to a method of creating the standard patterns used for pattern
recognition, such as speech recognition.
[Description of the Prior Art] 

In pattern recognition fields such as speech recognition, methods that use a probability model as
the standard pattern for recognition have attracted attention in recent years. In speech
recognition in particular, the hidden Markov model (hereinafter, HMM) is widely used as the model
that expresses a standard pattern.

An HMM is defined by a set of states, the transition probabilities between states, and the vector
output probabilities of the states or transitions. Recognition is performed by calculating the
likelihood of each HMM for an input pattern. Speech recognition by HMM is described in detail in
the book "Speech Recognition by Probability Model" by Seiichi Nakagawa.
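The likelihood calculation mentioned above can be sketched with the forward algorithm. This is only an illustration under stated assumptions, not the procedure of the cited book: it assumes a left-to-right HMM that starts in its first state and ends in its last state, single-Gaussian output distributions attached to states, and diagonal covariances. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence


def safe_log(p: float) -> float:
    """Natural log that maps probability 0 to -infinity instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf


def log_gauss_diag(y: Sequence[float], mean: Sequence[float], var: Sequence[float]) -> float:
    """Log density of a Gaussian with diagonal covariance (an assumption made for brevity)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (yi - m) ** 2 / v
        for yi, m, v in zip(y, mean, var)
    )


def log_add(a: float, b: float) -> float:
    """Compute log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi = max(a, b)
    return hi + math.log(math.exp(a - hi) + math.exp(b - hi))


def forward_log_likelihood(
    obs: Sequence[Sequence[float]],
    a_self: Sequence[float],   # a_ii for each state i
    a_next: Sequence[float],   # a_{i,i+1} for each state i
    means: Sequence[Sequence[float]],
    variances: Sequence[Sequence[float]],
) -> float:
    """Log likelihood of obs, starting in state 0 and ending in the last state."""
    n = len(a_self)
    alpha: List[float] = [-math.inf] * n
    alpha[0] = log_gauss_diag(obs[0], means[0], variances[0])
    for y in obs[1:]:
        prev = alpha
        alpha = []
        for i in range(n):
            stay = prev[i] + safe_log(a_self[i])
            move = prev[i - 1] + safe_log(a_next[i - 1]) if i > 0 else -math.inf
            alpha.append(log_add(stay, move) + log_gauss_diag(y, means[i], variances[i]))
    return alpha[-1]
```

Recognition then amounts to evaluating this function for the HMM of each candidate category and choosing the one with the largest value.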
For an HMM in which the vector output probability of each state (or transition) is expressed as a
mixed continuous distribution, a known way of determining the parameters is to start from some
initial values and update them repeatedly using training data, as in the Baum-Welch algorithm. In
this case, initial values of parameters such as the mean of the output probability distribution
must be set for each distribution in the mixture. Known ways of giving these initial values
include (a) setting them by random numbers and (b) perturbing the parameters of a single
distribution with random values (Institute of Electronics, Information and Communication
Engineers, Speech Study Group material SP89-48, "Examination of Japanese phoneme recognition by
continuous output distribution HMM").

On the other hand, as a method that determines the parameters directly from the training data
rather than by updating from some initial values, (c) a method is known in which the training
data are segmented and then clustered into as many clusters as there are distributions in the
mixture, and parameters such as the mean are computed from the data of each cluster ("High
Performance Connected Digit Recognition Using Hidden Markov Models", IEEE Transactions on
Acoustics, Speech, and Signal Processing, Vol. 37, No. 8, pp. 1214-1224, August 1989). The values
determined in this way can also be used as initial values and updated with the Baum-Welch
algorithm or the like.
[Problem(s) to be Solved by the Invention] 

When parameters are updated repeatedly by training, the choice of initial values is known to be
important for efficient training. If random numbers are used as in (a), or the parameters of a
single distribution are used as in (b), training is likely to take a long time to converge, and
the converged values are likely to be a local optimum rather than the global optimum. The method
of (c), on the other hand, does not necessarily require training to update the parameters, and is
thought to converge in a small number of iterations even when its result is used as the initial
value for updating. However, it has the drawback that the clustering requires additional
calculation, which increases the computational complexity.

The object of this invention is to provide a standard-pattern creation method that eliminates
these drawbacks.

[Means for Solving the Problem]

In a method of creating a standard pattern defined by a set of states, the transition
probabilities between states, and the vector output probabilities of the states or transitions,
the 1st invention is characterized by creating a standard pattern in which the vector output
probability of each state or transition is a mixed continuous distribution obtained by mixing,
with weights, the vector output probability distributions of the corresponding states or
transitions of two or more standard patterns whose vector output probabilities are expressed as
continuous distributions.

In a method of creating a standard pattern for speech recognition defined by a set of states, the
transition probabilities between states, and the vector output probabilities of the states or
transitions, the 2nd invention is characterized by creating a standard pattern in which the
vector output probability of each state or transition is a mixed continuous distribution obtained
by mixing, with weights, the vector output probability distributions of the corresponding states
or transitions of standard patterns that are trained for each of two or more speakers using that
speaker's voice data and whose vector output probabilities are expressed as continuous
distributions.

In a method of creating a standard pattern for speech recognition defined by a set of states, the
transition probabilities between states, and the vector output probabilities of the states or
transitions, the 3rd invention is characterized by creating a standard pattern in which the
vector output probability of each state or transition is a mixed continuous distribution obtained
by mixing, with weights, the vector output probability distributions of the corresponding states
or transitions of standard patterns that are trained for each environment using voice data
uttered or recorded in different environments and whose vector output probabilities are expressed
as continuous distributions.
[Function] 

According to this invention, the parameters of a standard pattern can be determined simply, by
composing the vector output probability distribution expressed as a mixed continuous distribution
from the vector output probability distributions of two or more already-trained standard
patterns. Moreover, if the standard patterns used for the composition are chosen appropriately
and the result is used as the initial parameters for training by the Baum-Welch method or the
like, training is expected to converge in fewer iterations than when the initial parameters are
set by random numbers, and the probability of converging to a local optimum is also expected to
be smaller. The composed pattern can also be used as is, without updating the parameters by
training.

If, as in the 2nd invention, the standard patterns used for the composition are trained for each
of two or more speakers using that speaker's voice data, a standard pattern for
speaker-independent speech recognition whose vector output probabilities are expressed as mixed
continuous distributions can be created simply.

If, as in the 3rd invention, the standard patterns used for the composition are trained for each
environment using voice data uttered or recorded in different environments, a standard pattern
that is robust to environmental fluctuation and whose vector output probabilities are expressed
as mixed continuous distributions can be created simply.
[Example] 

Fig. 1 is a block diagram for explaining an example in which the 1st invention is applied to
creating an HMM for speaker-independent speech recognition. HMM model A (3) is created from
speaker A's training data (1), and HMM model B (4) is created from speaker B's training data (2).
As speakers A and B, one standard speaker is chosen from each of, for example, male and female
speakers. The HMM is assumed to have the form shown in Fig. 2. For each state $i$, the state
transition probabilities $a_{ii}$ and $a_{i,i+1}$ (with $a_{ii} + a_{i,i+1} = 1$) and the output
probability distribution $b_i(y)$ over the output vector $y$ are defined. The state transition
probabilities and output probability distribution of Model A are written $a^A$, $b^A(y)$, and so
on. Supposing the output vector probability distribution is expressed by a single Gaussian
distribution, it is written as

$$b_i^A(y) = N(y, \mu_i^A, \Sigma_i^A)$$
$$b_i^B(y) = N(y, \mu_i^B, \Sigma_i^B)$$

Here, $N(y, \mu_i, \Sigma_i)$ denotes the multidimensional Gaussian distribution with mean vector
$\mu_i$ and covariance matrix $\Sigma_i$. From Model A and Model B, the HMM model C for
speaker-independent speech recognition (5) is created. The state transition probabilities of
Model C are written $a_{ii}^C$ and $a_{i,i+1}^C$, and its output probability distribution is
written $b_i^C$. The output probability distribution is assumed to be expressed by the following
mixed Gaussian distribution with two mixture components.

$$b_i^C(y) = \lambda_1 N(y, \mu_{i1}, \Sigma_{i1}) + \lambda_2 N(y, \mu_{i2}, \Sigma_{i2})$$
At this time, each parameter of Model C is defined as follows.

$$a_{ii}^C = \frac{a_{ii}^A + a_{ii}^B}{2}, \qquad a_{i,i+1}^C = \frac{a_{i,i+1}^A + a_{i,i+1}^B}{2}$$
$$\mu_{i1} = \mu_i^A, \quad \Sigma_{i1} = \Sigma_i^A, \quad \mu_{i2} = \mu_i^B, \quad \Sigma_{i2} = \Sigma_i^B$$
$$\lambda_1 = \lambda_2 = \frac{1}{2}$$

The model C created in this way can be used as is as the HMM for speaker-independent speech
recognition. It can also be used as an initial model that is trained further on the training data
of many speakers (6), by the Baum-Welch method or the like, to create a better model C (7).
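The parameter definitions above can be sketched in code as follows. This is a minimal illustration, not the patent's implementation: the container classes `GaussianState` and `MixtureState` and the function `merge_models` are hypothetical names, and the sketch assumes Models A and B have the same number of states in the same order.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


@dataclass
class GaussianState:
    """One state of a single-Gaussian HMM (hypothetical container)."""
    a_self: float  # a_ii, self-transition probability
    a_next: float  # a_{i,i+1}, transition probability to the next state
    mean: Vector   # mu_i
    cov: Matrix    # Sigma_i


@dataclass
class MixtureState:
    """One state of the composed Model C with a Gaussian-mixture output."""
    a_self: float
    a_next: float
    components: List[Tuple[float, Vector, Matrix]]  # (weight, mean, covariance)


def merge_models(model_a: List[GaussianState],
                 model_b: List[GaussianState]) -> List[MixtureState]:
    """Compose Model C: average the transitions, mix the Gaussians with weight 1/2 each."""
    if len(model_a) != len(model_b):
        raise ValueError("models must have the same number of states")
    merged = []
    for sa, sb in zip(model_a, model_b):
        merged.append(MixtureState(
            a_self=(sa.a_self + sb.a_self) / 2,
            a_next=(sa.a_next + sb.a_next) / 2,
            components=[(0.5, sa.mean, sa.cov), (0.5, sb.mean, sb.cov)],
        ))
    return merged
```

Because each pair of transition probabilities sums to 1 in both source models, their averages also sum to 1, so Model C needs no renormalization.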

When Models A and B are models whose output probability distributions are expressed as mixed
Gaussian distributions, Model C can be created in the same way. In this case, the number of
mixture components of the output probability distribution of Model C is the sum of the numbers of
mixture components of the output probability distributions of Models A and B.
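A sketch of this mixture case for one state follows. The text does not say how the component weights are set here, so the rescaling of each model's weights by 1/2 is an assumption made by analogy with the single-Gaussian example. The function name `merge_mixtures` and the `weight_a` parameter are hypothetical.

```python
from typing import List, Tuple

# One mixture component: (weight, mean vector, covariance matrix)
Component = Tuple[float, List[float], List[List[float]]]


def merge_mixtures(mix_a: List[Component],
                   mix_b: List[Component],
                   weight_a: float = 0.5) -> List[Component]:
    """Concatenate two Gaussian mixtures, rescaling weights so they still sum to 1.

    Assumption: Model A's components are scaled by weight_a and Model B's by
    1 - weight_a; the default 0.5 mirrors lambda_1 = lambda_2 = 1/2.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    weight_b = 1.0 - weight_a
    merged = [(weight_a * w, m, c) for w, m, c in mix_a]
    merged += [(weight_b * w, m, c) for w, m, c in mix_b]
    return merged
```

The result has as many components as the two inputs combined, which matches the statement that the number of mixture components of Model C is the sum of those of Models A and B.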

Next, an example of the 2nd invention is explained. By clustering the voice data of a small
vocabulary uttered by many speakers, the speakers are divided into M clusters, and one
representative speaker is chosen from each cluster, giving M speakers. For each of the M
speakers, an HMM whose output probability distributions are expressed by single Gaussian
distributions is trained on the amount of voice data needed for HMM training. From the M models
created, the HMM for speaker-independent speech recognition is obtained, as in the example of the
1st invention, by creating an HMM whose output probability distributions are mixed Gaussian
distributions with M mixture components. Since only a small amount of data is needed for the
clustering that chooses the M speakers, the computational complexity is lower than in prior-art
method (c).
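By analogy with the two-model example, the composition from M speaker models could be written as below. The equal weights $1/M$ and the averaging of transition probabilities are assumptions that extend $\lambda_1 = \lambda_2 = 1/2$; the text does not state the weights for the M-model case. The superscript $(m)$ is a hypothetical notation indexing the model trained for the $m$-th representative speaker.

```latex
b_i^{C}(y) = \sum_{m=1}^{M} \lambda_m \, N\!\left(y, \mu_i^{(m)}, \Sigma_i^{(m)}\right),
\qquad \lambda_m = \frac{1}{M}

a_{ii}^{C} = \frac{1}{M} \sum_{m=1}^{M} a_{ii}^{(m)},
\qquad
a_{i,i+1}^{C} = \frac{1}{M} \sum_{m=1}^{M} a_{i,i+1}^{(m)}
```

With $M = 2$ these expressions reduce to the parameter definitions given for Model C in the example of the 1st invention.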

Finally, an example of the 3rd invention is explained. In the example of the 1st invention, if
Models A and B are chosen to be models trained on data uttered by a given speaker in different
environments (for example, a quiet environment and a noisy environment), a recognition model
that is robust to environmental fluctuation can be created as Model C.
[Effect of the Invention] 

As stated above, according to the 1st invention, the parameters of a standard pattern whose
vector output probabilities are expressed as mixed continuous distributions can be determined
easily from two or more already-trained standard patterns, and the pattern can be used for
pattern recognition either as is or after a small number of training iterations with these values
as initial values. Moreover, according to the 2nd and 3rd inventions, a standard pattern for
speaker-independent recognition and a standard pattern robust to environmental fluctuation,
respectively, can be created simply.
DESCRIPTION OF DRAWINGS



[Brief Description of the Drawings] 

Fig. 1 is a block diagram for explaining an example in which the 1st invention is applied to
creating an HMM for speaker-independent speech recognition.

Fig. 2 is a drawing showing the form of the HMM model in the example.

1 .... Speaker A's training data

2 .... Speaker B's training data

3 .... HMM model A

4 .... HMM model B

5 .... HMM model C

6 .... Training data of many speakers

7 .... HMM model C